Decoupled Asynchronous Proximal Stochastic Gradient Descent with Variance Reduction

نویسندگان

  • Zhouyuan Huo
  • Bin Gu
  • Heng Huang
چکیده

In the era of big data, optimizing large scale machine learning problems becomes a challenging task and draws significant attention. Asynchronous optimization algorithms come out as a promising solution. Recently, decoupled asynchronous proximal stochastic gradient descent (DAP-SGD) is proposed to minimize a composite function. It is claimed to be able to offload the computation bottleneck from server to workers by allowing workers to evaluate the proximal operators, therefore, server just need to do element-wise operations. However, it still suffers from slow convergence rate because of the variance of stochastic gradient is not zero. In this paper, we propose a faster method, decoupled asynchronous proximal stochastic variance reduced gradient descent method (DAP-SVRG). We prove that our method has linear convergence for strongly convex problem.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Asynchronous Stochastic Proximal Optimization Algorithms with Variance Reduction

Regularized empirical risk minimization (R-ERM) is an important branch of machine learning, since it constrains the capacity of the hypothesis space and guarantees the generalization ability of the learning algorithm. Two classic proximal optimization algorithms, i.e., proximal stochastic gradient descent (ProxSGD) and proximal stochastic coordinate descent (ProxSCD) have been widely used to so...

متن کامل

Asynchronous Doubly Stochastic Proximal Optimization with Variance Reduction

In the big data era, both of the sample size and dimension could be huge at the same time. Asynchronous parallel technology was recently proposed to handle the big data. Specifically, asynchronous stochastic (variance reduction) gradient descent algorithms were recently proposed to scale the sample size, and asynchronous stochastic coordinate descent algorithms were proposed to scale the dimens...

متن کامل

Make Workers Work Harder: Decoupled Asynchronous Proximal Stochastic Gradient Descent

Asynchronous parallel optimization algorithms for solving large-scale machine learning problems havedrawn significant attention from academia to industry recently. This paper proposes a novel algorithm,decoupled asynchronous proximal stochastic gradient descent (DAP-SGD), to minimize an objective func-tion that is the composite of the average of multiple empirical losses and a r...

متن کامل

Asynchronous Stochastic Gradient Descent with Variance Reduction for Non-Convex Optimization

We provide the first theoretical analysis on the convergence rate of the asynchronous stochastic variance reduced gradient (SVRG) descent algorithm on nonconvex optimization. Recent studies have shown that the asynchronous stochastic gradient descent (SGD) based algorithms with variance reduction converge with a linear convergent rate on convex problems. However, there is no work to analyze asy...

متن کامل

Asynchronous Stochastic Proximal Methods for Nonconvex Nonsmooth Optimization

We study stochastic algorithms for solving non-convex optimization problems with a convex yet possibly non-smooth regularizer, which nd wide applications in many practical machine learning applications. However, compared to asynchronous parallel stochastic gradient descent (AsynSGD), an algorithm targeting smooth optimization, the understanding of the behavior of stochastic algorithms for the n...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:
  • CoRR

دوره abs/1609.06804  شماره 

صفحات  -

تاریخ انتشار 2016